4973. Mutation

 

Genetic scientists of the planet Olympia are again experimenting with the DNA of primitive organisms. The genome of an organism is a sequence of genes, each of which can be encoded with one positive integer. Genes that are encoded by the same numbers are considered the same, and vice versa, genes that are encoded by different numbers are considered different.

Scientists have already deduced some primitive organism and want to modify its genome in such a way as to obtain an ideal organism. They believe that in the future it will help to find a cure for many diseases.

An organism is considered ideal if any two identical genes are either located in adjacent positions in the genome, or there is at least one the same gene between them.

In one operation, scientists can select and remove one or more identical genes from the body's genome, after which they can be inserted back into the genome, but perhaps in different positions. Since each such operation weakens the body, scientists want to achieve their goal, while performing as few operations as possible.

Write a program that, according to a given representation of the genome, will determine the smallest number of operations necessary to obtain an ideal organism.

 

Input. The first line contains the number of genes n (1 ≤ n ≤ 105) in the genome of the primitive organism. The next line contains n positive integer, each does not exceed n – the sequence of genes in the genome.

 

Output. Print the least number of operations for which scientists can get a perfect organism.

 

Sample input

Sample output

9

1 2 1 3 1 3 2 4 5

2

 

 

SOLUTION

greedy

 

Algorithm analysis

The genes should be rearranged in such a way that the same genes stand side by side. In this case, the number of operations performed should be minimized.

For each positive integer ai associate the segment [si; ei], where si is the position where ai appears the first time and ei is the position where ai appears the last time.

Let res be the largest possible number of disjoint segments. Find this value using the “order selection” algorithm. This means that each of the other segments intersects with at least one of the selected ones. And over the genes that correspond to them, it is necessary to carry out the described procedure.

If k is the number of all constructed segments, then the answer to the problem will be the value kres.

 

Example

For the given example, the segments will have the form (the positions are numbered starting from zero):

·        for 1: [0; 4];

·        for 2: [1; 6];

·        for 3: [3; 5];

·        for 4: [7; 7];

·        for 5: [8; 8];

Sort the segments by the coordinate of its end:

[0; 4], [3; 5], [1; 6], [7; 7], [8; 8]

The maximum number of non-intersecting line segments is 3. For example, you can select line segments [0; 4], [7; 7], [8; 8]. The procedure described in the problem statement should be carried out over the genes corresponding to the remaining segments. These genes are 2 and 3. Two permutations can, for example, be done as follows:

 

Algorithm realization

Let MAX be the maximum possible number of genomes in the input sequence.

 

#define MAX 100001

 

The vector of pairs v stores the segments [si; ei].

 

vector<pair<int, int> > v;

int s[MAX], e[MAX];

 

Read the input data. Initialize s[i] = -1, e[i] = -1.

 

scanf("%d", &n);

memset(s, -1, sizeof(s));

memset(e, -1, sizeof(e));

 

for (i = 0; i < n; i++)

{

  scanf("%d", &x);

 

If number x occurs for the first time, then its position in the input sequence is stored in s[x].

 

  if (s[x] == -1) s[x] = i;

 

If number x occurs not for the first time, then its position is stored in e[x], assuming that x occurs for the last time.

 

  if (i > e[x]) e[x] = i;

}

 

Store all constructed segments in the vector v.

 

for (i = 1; i <= n; i++) // numbers 1..n

  if (s[i] != -1) v.push_back(make_pair(e[i], s[i]));

 

The segments were inserted to the vector in the reverse order (like [ei; si]), so that default sort function will sort the segments in increasing order of their ends.

 

sort(v.begin(), v.end());

 

Find the maximum number res of disjoint segments.

 

i = res = 0;

while (i < v.size())

{

 

The i-th segment is added to the set of disjoint segments. The variable temp contains the end of this segment (let's call it the current segment).

 

  res++; temp = v[i++].first; // end

 

As long as the start of the next segment is less than the end of the current one, skip such segment (it intersects with the current one).

 

  while (i < v.size() && v[i].second < temp) i++;

}

 

Print the answer.

 

printf("%d\n", v.size() - res);